ABSTRACT

Deep sequencing studies frequently identify small 
RNA fragments of abundant RNAs. These fragments 
are thought to represent degradation products of 
their precursors. Using sequencing, computational 
analysis, and sensitive northern blot assays, we 
show that constitutively expressed non-coding 
RNAs such as tRNAs, snoRNAs, rRNAs and 
snRNAs preferentially produce small 5' and 3' end 
fragments. Similar to microRNA processing,
these terminal fragments are generated in an asym- 
metric manner that predominantly favors either the 
5' or 3' end. Terminal-specific and asymmetric pro- 
cessing of these small RNAs occurs in both mouse 
and human cells. In addition to the known process- 
ing of some 3' terminal tRNA-derived fragments 
(tRFs) by the RNase III endonuclease Dicer, we 
show that several RNase family members can 
produce tRFs, including Angiogenin that cleaves 
the TψC loop to generate 3' tRFs. The 3' terminal
tRFs but not the 5' tRFs are highly complementary 
to human endogenous retroviral sequences in the 
genome. Despite their independence from Dicer 
processing, these tRFs associate with Ago2 and 
are capable of down regulating target genes by 
transcript cleavage in vitro. We suggest that en- 
dogenous 3' tRFs have a role in regulating the un- 
warranted expression of endogenous viruses 
through the RNA interference pathway. 



INTRODUCTION 

Recent applications of deep sequencing methods have led 
to the identification of a surprising diversity of non- 
protein-coding RNAs (ncRNAs), including degradation-
like small RNA fragments derived from miRNAs (1,2),
snoRNAs (3-5) and tRNAs (6-12). Evidence is 
accumulating that these small RNA fragments are pre- 
cisely processed to participate in diverse biological 
processes and are conserved in distantly related species 
(2,3,5-7,13). The first small snoRNA-derived RNA that
functions as a miRNA was reported 3 years
ago (3). Since then, ~10 additional snoRNA-derived small
(>18 nt) RNAs that can function as miRNAs have been
described (5). We have also reported a group of human
and viral unusually small (~15 nt) RNAs (usRNAs)
that are derived from miRNAs as well as other
non-coding regions, and can regulate gene (e.g. RAD21)
expression (2). Production of these usRNAs is evolutionarily
conserved (13,14) and is associated with
hippocampal functions in mice (13).

It is common to find small RNA fragments that match 
long RNA transcripts such as mRNAs in small RNA 
sequencing data (8). The presence of such small RNA frag- 
ments is generally attributed to the extreme sensitivity of 
deep sequencers in detecting low-abundance RNAs 
including degradation products. However, even very low- 
abundance RNAs, as low as four copies per cell, originating 
from the Cyclin D1 (CCND1) promoter region are reported
to have regulatory functions (15). While many of the RNA 
fragments in deep-sequencing data frequently correspond 
to low abundance, potentially non-functional RNA 
products, RNA fragments found at even greater abundance
than most miRNAs are also frequently observed in small
RNA sequencing data (2,7). Common sources for such
RNA fragments are tRNAs and rRNAs. Since tRNAs
and rRNAs constitute most of the cellular RNA, it is rea-
sonable to assume that these RNAs generate many more
degradation products than other RNAs, resulting in the
preferential detection of their degradation products in
deep-sequencing. While the notion of degradation
products is appealing, it was recently shown that tRNAs
and rRNAs undergo stress-induced cleavage to produce
stable RNA products, and this mechanism is conserved
from yeast to human cells (11). In human cells, tRNA frag-
ments have been reported to induce transient translational
arrest (12). Various tRNA fragments are produced in cells
under stress, notably during starvation in Tetrahymena
thermophila (9) and serum deprivation in Giardia lamblia
(10), or during development in Aspergillus fumigatus (16).
Stress-induced tRNA cleavage is mediated by Rny1p in
yeast (17) and angiogenin (ANG) in humans (12,18).
These stress-induced tRNA cleavage products are
30-50 nt in length and seem to form related RNA classes,
termed sitRNA (10), tiRNA (12,19) and tRNA halves
(11,16,18). An additional class of small RNAs that are
~20-30 nt long, and are derived from 5' or 3' ends of
tRNAs or from the genomic region following the 3' end
of tRNAs, broadly termed tRNA fragments (tRFs), was
also reported (7). Subsequently, another class of seemingly 
related tRNA-derived small RNAs (tsRNAs) was found to 
function similarly to miRNAs. These tsRNAs are 
processed by either Dicer or RNase Z, depending on the 
tRF locations within both the mature tRNAs and their 
precursors (6). In summary, the widespread occurrence of 
stable small RNAs and their emerging roles in cellular 
processes indicate that we cannot ignore such molecules 
as randomly generated degradation products. Systematic 
analysis of all sequence reads from deep sequencing data 
may prove useful in defining new biological features and 
mechanisms relating to both canonical and non-canonical 
forms of small RNAs.

In this study, we found that the majority of the ~20 nt 
long fragments deriving from the mature sequences of the 
widely expressed tRNAs, rRNAs, snoRNAs and snRNAs, 
but not mRNAs, are produced in a specific cleavage pattern 
from the 5' or 3' ends. We note that this pattern was previ- 
ously observed for snoRNAs (3-5), and tRNAs (6-8). 
Similar to the processing of miRNAs, these terminal 
RNAs accumulate in the cell in an asymmetric manner 
that favors the expression of either the 5' or the 3' end frag-
ments. The terminal fragments are largely independent of
canonical RNA interference (RNAi) processing 
machineries, which include Dicer that processes precursor 
miRNAs (pre-miRNAs), and DGCR8, which partners 
with Drosha to form the microprocessor complex that 
processes primary miRNA transcripts to pre-miRNAs. 
The biogenesis of tRFs is likely to depend on multiple 
RNase family members, including ANG that cleaves 
tRNAs within the TψC loop in a sequence-specific
manner. Notably, antisense sequences of 3'-terminal tRFs 
are highly enriched in genomic regions harboring retroviral 
sequences, and 3' tRFs are found to cleave target RNA 
through endogenous association with Ago2. 



MATERIALS AND METHODS 

Small RNA sequence analysis 

Small RNA reads from the B lymphoma BCP1 cell line
were processed and mapped to the transcriptome as
reported earlier (2). Small RNA sequences from mouse
embryonic stem cells and HEK293 cells were retrieved
from NCBI (GSE12521 and GSE16579). A total of 522
tRNA genes were curated from gtRNAdb (20), excluding
109 predicted pseudo tRNA genes. Perfect matches
between reads and tRNAs were imposed to obtain the
most reliable list of tRNA-derived RNAs. To eliminate
any bias in our analysis due to terminal CCA motifs
that are known to occur in tRNAs, the terminal CCA of
tRNAs was masked. For comparison of the abundance of
miRNA and different classes of terminal RNA in Dicer or
DGCR8 knockout libraries, reads from different libraries
were normalized using the total number of mRNA
fragments in each library. For normalization, the read
count of each distinct RNA in a library was first divided
by the total number of mRNA fragments (>15 nt) in that
library, and then multiplied by the
total number of mRNA reads in the reference (wild-type)
library. To focus on the most reproducible set of tRFs, a
threshold of 10 reads/million was used for the analysis of
tRFs (Figures 3A, 4B, C and 8B; Supplementary
Figure S5).
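
The mRNA-based normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function name normalize_to_reference and the toy counts are hypothetical, and the inputs are assumed to be raw read counts per distinct RNA.

```python
def normalize_to_reference(counts, mrna_total, ref_mrna_total):
    """Scale read counts of one library to the reference (wild-type) library.

    counts:          dict mapping an RNA identifier to its raw read count
    mrna_total:      total number of mRNA-fragment reads in this library
    ref_mrna_total:  total number of mRNA-fragment reads in the reference library
    """
    if mrna_total <= 0:
        raise ValueError("mrna_total must be positive")
    # Divide by this library's mRNA total, then multiply by the reference total.
    scale = ref_mrna_total / mrna_total
    return {rna: n * scale for rna, n in counts.items()}


# Hypothetical knockout library with half as many mRNA reads as wild-type.
knockout_counts = {"tRF_3p_example": 40, "miR_example": 5}
normalized = normalize_to_reference(knockout_counts, mrna_total=1000, ref_mrna_total=2000)
```

Using mRNA fragments as the denominator assumes, as argued in the Results, that they behave like random degradation products and are therefore not affected by the knockouts.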

To detect putative tRNA orthologs, we identified 137 
human and mouse tRNA pairs with significantly high 
similarity (>80%) over >90% of the tRNA sequences. 
For analysis of retrotransposons, coordinates for human
LTR, LINE and SINE elements were downloaded from
the UCSC genome database (21). A potential tRF binding
site was required to be a perfect match of
the tRF antisense sequence to the LTR, LINE or SINE
elements. To eliminate artifacts due to the different sizes
of LTR, LINE and SINE regions, the number of putative
binding sites was normalized with respect to the total
length of each of those elements. Since the total length of the
regions spanned by LINE and SINE elements was larger
than that of LTR elements, normalization factors
of 0.42 and 0.67 were used to multiply the number of
binding sites in LINE and SINE elements, respectively.
All tRF sequences that were identified in BCP1 and were at least
10 nt long were used for the analysis. Since tRNAs can
share similar terminal sequences, the terminal sequences 
that match to the most anti-LTR tRFs were selected as 
representative regions for illustrations. 
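
The binding-site search and length normalization described above can be sketched as follows. This is a hedged illustration only: the helper names are hypothetical, the search is limited to the strand given (an assumption), and the LTR factor of 1.0 is assumed because LTR serves as the reference class. The factors 0.42 and 0.67 are the ones stated in the text.

```python
# Complement table for DNA, with U treated like T so RNA reads can be used directly.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")

# Length-normalization factors; 0.42 and 0.67 are from the text, 1.0 for LTR is assumed.
LENGTH_FACTORS = {"LTR": 1.0, "LINE": 0.42, "SINE": 0.67}


def reverse_complement(seq):
    """Return the antisense (reverse-complement) DNA sequence of seq."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_perfect_sites(trf, element_seq):
    """Count (possibly overlapping) perfect matches of the tRF antisense sequence."""
    target = reverse_complement(trf)
    element_seq = element_seq.upper()
    count = 0
    start = element_seq.find(target)
    while start != -1:
        count += 1
        start = element_seq.find(target, start + 1)
    return count


def normalize_site_counts(raw_counts):
    """Multiply raw site counts per element class by its length factor."""
    return {cls: n * LENGTH_FACTORS[cls] for cls, n in raw_counts.items()}
```

For example, normalize_site_counts({"LTR": 10, "LINE": 100, "SINE": 100}) scales the LINE and SINE counts down so that the three classes can be compared on an equal-length basis.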

Northern blot analysis 

Northern blots were based on our recently published
protocol (22). Briefly, total RNA (10-50 µg) was separated
on 18% denaturing polyacrylamide gels and electro-
transferred to positively charged nylon membranes from
Roche (Roche Applied Science, Indianapolis, IN, USA).
LNA-DNA mixed oligonucleotide probes were synthesized
by IDT (Integrated DNA Technologies, Inc., Coralville, IA,
USA). Probes were labeled with the small molecule
Digoxigenin (DIG) using Oligo End Tailing Kit (Roche
Applied Science, Indianapolis, IN, USA). Hybridization
was carried out overnight in Ultrahyb buffer (Ambion,
Austin, TX, USA) at 42°C and washing was performed in
two rounds of 0.1× SSC/0.1% SDS at 60°C. DIG signals
were detected with alkaline phosphatase-conjugated
anti-DIG antibody (Cat. 11093274910, Roche). Probes
used for northern blotting correspond to (LNAs in bold):
3' end tRF of tRNA LysTTT (5'-TGGCGCCCGAACAGG
A-3'); 5' end tRF of tRNA LysTTT (5'-CTGAGCTATCCG
GGAA-3'); 3' end tRF of tRNA PheGAA (5'-GTGCCGAA
ACCCG-3'); 5' end tRF of tRNA PheGAA (CTCCCA
ACTGAGCTATTTCG); miR-17 control RNA (5'-CTAC
CTGCACTGTAAGCACTTTG-3'), 5.8S rRNA (5'-TGAT
CCACCGCTAAGAGT-3'); U6 snRNA (5'-ATATGGAA
CGCTTCACGAATT-3').

For Ago2-association studies, immunoprecipitated
RNA and 5 µg total RNA isolated from HEK293 cells
using Trifast (Peqlab) were separated by 12% denaturing
RNA PAGE and transferred to a nylon membrane (GE
Healthcare) by semidry electroblotting. Membranes were
chemically cross-linked with 1-ethyl-3-(3-dimethylaminopropyl)
carbodiimide (EDC) for 1 h at 50°C
and hybridized overnight at 50°C with the following probes:
5'-TGGTGCCGTGACTCGGA-3' (tRNA HisGTG), 5'-AT
GGTGTCAGGAGTGGGA-3' (tRNA LeuCAG) and
5'-TCACAAGTTAGGGTCTCAGGGA complementary
to miR-125b.

In vitro cleavage assay 

Recombinant human ANG (Sigma-Aldrich) was dissolved
and diluted to a final concentration of 10 µg/ml in 30 mM
HEPES, 30 mM NaCl and 0.01% BSA. The working
buffer in the absence of ANG was used as the control
buffer. RNase A was purchased from Qiagen (Valencia,
CA, USA). RNase I and RNase T1 were purchased from
Ambion (Carlsbad, CA, USA). Incubations of total RNA
were carried out at 37°C for 0, 1, 2 and 4 h. Incubations of
in vitro transcribed RNA were carried out at 37°C for
0.5 h. The incubation was stopped by adding Trizol
(Invitrogen) to the reaction mixture, from which RNA
was extracted. To help with RNA precipitation, 20 µg
glycogen (Life Technologies, Carlsbad, CA, USA) was
added to each sample.

Ago immunoprecipitation and cleavage assay 

The construction of human FLAG/HA-Ago1 and FLAG/
HA-Ago2 was reported earlier (23). For the RNA cleavage
assay, a synthetic transcript was generated by PCR
amplification and in vitro transcription (T7 RNA polymer- 
ase, Fermentas). Primers used were: 5'-TAATACGACTC 
ACTATAGAACAATTGCTTTTACAG-3' (T7 primer), 
5'-ATTTAGGTGACACTA TAGGCATAAAGAATTG 
AAGA-3' (SP6 primer). Target sequences were 5'-GAAC 
AATTGCTTTTACAGATGCACATATCGAGGTGA 
ACATCACGTACG TGGTGCCGTGACTCGGA TCG 
GTTGGCAGAAGCTAT-3' (tRNA HisGTG) and 5'-G 
AACAATTGCTTTTACAGA TGCACATATCGAGG 
TGAACATCACGTACG ATGGTGTCAGGAGTGGG 
ATCGGTTGGCAGAAGCTAT-3' (tRNA LeuCAG) 
and 5'-GGCATAAAGAATTGAAGAGAGTTTTCAC 
TGCA TACGACGATTCTGTGATTTGTATTCAGC 
CCATATCGTTTCATAGCTTCTGCCAACCGA for
the sequence complementary to the oligonucleotide with
the specific cleavage sequence. PCR products were
purified on denaturing RNA-PAGE followed by etha-
nol precipitation. For RISC activity assays, substrates
were 32P-cap labeled as follows: 1.5 µl (α-32P)-GTP
(3000 Ci/mmol), 1 µl 10× buffer (0.4 M Tris pH 8.0,
60 mM MgCl2, 100 mM DTT, 20 mM spermidine), 0.25 µl
RNasin (Promega), 1 µl S-adenosyl-Met (500 nM), 1 µl
DTT (1 M), 1 µl Guanyltransferase and RNA from the
in vitro transcription reaction were incubated for 3 h at
37°C. RNA was purified by denaturing RNA-PAGE and
recovered by ethanol precipitation. Ago complex-
containing anti-FLAG beads were incubated in a reaction
containing 5 nM target RNA, 1 mM ATP, 0.2 mM GTP,
10 U/ml RNasin (Promega), 100 mM KCl, 1.5 mM MgCl2
and 0.5 mM DTT for 1.5 h at 30°C. The reaction was
stopped by proteinase K treatment and RNA was isolated
by phenol/chloroform extraction and ethanol precipitation.
Cleavage products were analysed by denaturing RNA-
PAGE followed by autoradiography using BioMax MS
film (Kodak) and an intensifying screen (Kodak).

Production of wild-type and mutated tRNA molecules 

To ensure minimum structural disruptions in mutated 
tRNA (LysTTT) sequences, the RNAs were folded by 
Vienna RNA Package (24), and then manually inspected 
using VARNA (25). Wild-type and mutated tRNA se- 
quences were constructed using mirVana miRNA Probe 
Construction kit (Ambion, Austin, TX, USA). Briefly, 
wild-type or mutated tRNA DNA oligos were synthesized 
(IDT, IA, USA) and annealed to a shorter DNA oligo
harboring T7 promoter sequence. The partial double- 
stranded DNA was filled in by Klenow polymerase and 
then subjected to in vitro transcription by T7 RNA poly- 
merase. Synthesized RNA was purified by Trizol 
(Invitrogen), before performing in vitro cleavage assays. 

RESULTS 

Constitutive ncRNAs but not mRNAs generate 
termini-specific RNAs 

We previously reported on ~15nt long unusually small 
RNAs (usRNAs) and their sub-groups (2), consisting of 
3' end-specific motifs in human and viral genomes. Two 
of these usRNA subgroups share a CCA element in their 
3' end. To further study RNAs with the terminal CCA 
motif, we reanalyzed 307 269 RNA sequence reads 
(10-30 nt) that were identified in the human
KSHV-infected primary-effusion lymphoma cell line,
BCP1 (26). The CCA motif occurs precisely at the 3' end
in 24 601 reads (4482 distinct sequences; Figure 1A).
Terminal, 3'-specific CCA motifs are known to occur in
almost all mature tRNAs (~70-100 nt), which act as sites
for amino acid residue attachment and as stabilizers for 
tRNA-ribosome interactions (27). Since the majority 
(85.6%) of reads containing terminal CCA motifs con- 
sisted of usRNAs or even smaller RNAs that have not 
been the focus of previous studies (6,7,10-12,16,18,19), 
we sought to fully characterize the spectra of ~10-30
Figure 1. Characteristics of small RNAs (sRNAs) matching to 5' and 3' termini of tRNAs. All positions are determined with respect to either the
5' or 3' ends of sRNAs/tRNAs. To determine the positions, the sRNAs (violet) and tRNAs (blue) are aligned by their 5'-3' ends, as noted by
respective illustrations. (A) The CCA motif preferentially occurs at the 3' termini (position -1) of the sRNAs. To avoid artifacts from single
[...] reads, 5' terminal of tRNAs are also specifically processed to yield RNA fragments (5' terminal peak). The tRNA position (with respect to its 5' end)
at which the 5' end of each sRNA read matches is used, as depicted by the bottom illustration. (D) Length distribution of 5' (black) and
[...] broader range (~16-18 nt).



base long terminal CCA-containing RNAs. To reduce 
artifacts from short length reads, the analysis was 
limited to those reads that perfectly matched tRNA 
ends. More than half (13 579 of 24 601 reads) of the 
terminal CCA-containing small RNAs matched tRNAs. 
We next tested whether tRNAs produce other 10-30 nt
long RNAs at other positions in tRNAs in an unbiased
manner as expected for a degradation process involving
random exo/endo-nuclease cleavage. The majority of tRFs
precisely match 5' or 3' ends of tRNAs (Figure 1B and C)
and these 5'-3' terminal tRFs match 76% (447/522) of
human tRNAs. We find a significantly higher proportion
of tRF reads (76%) than previously reported (7) for tRFs
(~55%), likely because gel-based isolation of RNAs
>10 nt seems to be necessary to retain the full spectrum
of usRNAs (2). Interestingly, 5' and 3' tRFs display
different size distributions (Figure 1D). The size distribution of
5' terminal tRFs peaks at 14 and 15 nt, which is
smaller than the previously reported
18-19 nt (7); this difference may be due to our inclusion
of a broader range (10-30 nt) of RNA fragments, or may be
a result of differences between the cell lines in the two
studies.
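
The positional analysis above, which asks whether a read matches a tRNA end rather than a random internal position, can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the function names are hypothetical, and stripping a trailing CCA from both the read and the reference is one assumed way of implementing the CCA masking described in the Methods.

```python
from collections import Counter


def strip_cca(seq):
    """Drop a trailing CCA so the motif alone cannot drive a match (assumed masking)."""
    return seq[:-3] if seq.endswith("CCA") else seq


def classify_read(read, trna):
    """Return '5p', '3p' or 'internal' for a perfect match, or None if there is none."""
    read = strip_cca(read.upper())
    trna = strip_cca(trna.upper())
    if not read or len(read) > len(trna):
        return None
    if trna.startswith(read):
        return "5p"
    if trna.endswith(read):
        return "3p"
    if read in trna:
        return "internal"
    return None


def tally_positions(reads, trnas):
    """Count reads per class, preferring a terminal label when several tRNAs match."""
    tally = Counter()
    for read in reads:
        labels = {classify_read(read, t) for t in trnas} - {None}
        for label in ("5p", "3p", "internal"):
            if label in labels:
                tally[label] += 1
                break
    return tally
```

Under random degradation the "internal" class would be expected to dominate; a strong excess of "5p" and "3p" labels is the pattern reported in Figure 1B and C.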

We next examined if other classes of RNAs also generate 
termini-specific small RNAs. The phenomenon of 5'-3'- 
specific processing is observed across all major classes of 
ncRNAs but not mRNAs (Figure 2). Terminal tRFs are the 
most abundant fragments found in deep sequencing, 
followed by rRNA-derived RNAs (Supplementary
Figure S1; Figure 1B and C). All ncRNAs generate both
5' and 3' products except snRNAs, which produce small 
RNAs exclusively from their 3' termini. We note that 
5' modifications on mRNAs and snRNAs (28) hinder the 
cloning of 5' fragments, which might be the most plausible 
reason for the absence of 5' snRNA-derived reads in deep 
sequencing data. However, despite the large number of 
mRNAs, very few mRNA-derived fragments are detected 
at 5'-3' ends and the proportion of mRNA-derived
fragments across all mRNAs is negligible (Figure 2), sug-
gesting that the processing of mRNAs into stable fragments
is not a common phenomenon. In comparison to
tRNA-derived tRFs with a read number density of ~1.0
read per base, mRNAs yield a noticeably low read density
of 0.0005. Furthermore, there were no reads of at least
15 nt consisting purely of poly(A) tail sequence, suggesting
that end processing of mRNAs is a rare event. Although
termini-specific stable fragments of mRNAs are not
detected in our analysis, it is important to note that very
low abundance (1 copy/10 cells) short RNAs also occur
near transcription start sites of mRNAs (29,30). These tran- 
scription start site-associated (29) RNAs (TSSa-RNAs) or 
transcription initiation (30) RNAs (tiRNAs) are distinct 
from termini-specific RNA fragments, because they map 
to a wide range of up/down-stream regions near transcrip- 
tion start sites. Thus, small RNAs from ncRNA termini are 
likely to be produced by a different mechanism than that 
used for TSSa-RNA and tiRNA generation, as well as 
mRNA degradation. 

Asymmetric processing of 5'-3' terminal fragments 

During miRNA biogenesis, each miRNA precursor is pro- 
cessed to generate a small (~22 nt) duplex, from which one 
of the strands (miRNA) is preferentially incorporated into 
Argonaute (AGO) and thus stabilized, while the other 
strand, termed the passenger strand is often degraded 
(31,32). Since the 5' and 3' ends of both tRNAs (33) and 
many snoRNAs (28) form RNA duplexes, we speculated 
that the terminal RNAs may be generated asymmetrically 
from the 5' and 3' ends of the duplex. To explore this 
possibility, for each tRNA, we compared the number of 
5' terminal RNAs to that of 3' terminal RNAs. Among 
the 447 tRNA genes that match terminal tRFs, 335 (75%) 
tRNAs match both 5' and 3' terminal tRFs. Remarkably, 
93% of these (313/335) tRNAs manifest a considerable (2×)
difference between their 5' and 3' tRF levels, while nearly
half of them (161) yield a difference of 10×. To ensure that
these biases are not due to tRNAs that are similar in their
sequences, we reanalyzed our data using a non-redundant
set of 99 tRNAs that shared <90% sequence identity to
reconfirm the observed biases (Figure 3A). To further
confirm that the biases are not limited to our BCP1 library
and to rule out protocol-based artifacts, we also analyzed 
the results from an independent library from HEK293 cells 
(34). We found similar patterns that were additionally con- 
firmed using northern blots for four tRFs derived from two 
different tRNAs (Supplementary Figure S2). These results 
support the notion that tRNAs are generally processed into 
two terminal tRFs but only one fragment is preferentially 
maintained in cells, reminiscent of miRNA maturation. The 
bias is stronger for snoRNAs and snRNAs; 95% (66/69) of
snoRNAs generate terminal RNAs primarily from either
the 5' or 3' end (Figure 3B), while all terminal snRNA frag-
ments are exclusively derived from 3' ends (Figure 2).
Terminal RNAs of rRNAs also seem to manifest a general
preference toward the 3' terminus (Figure 3C). Since 5' and
3' ends of rRNAs likely do not form intra-molecular base 
pairs to form a double-stranded structure within the 
molecule (35), the asymmetry in ncRNA terminal RNAs 
could be due to stabilization of the 5' or 3' fragment that 
is cleaved off by the processing machinery. 
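
The asymmetry comparison above, which counts tRNAs whose 5' and 3' tRF levels differ by at least 2× or 10×, can be sketched as follows. This is an illustrative sketch with hypothetical names and toy counts; expressing the bias as a log2 ratio is an assumption made here for convenience, not a statement of how the figures were computed.

```python
import math


def log2_bias(five_prime, three_prime):
    """log2 ratio of 5' to 3' read counts; positive values favor the 5' end."""
    if five_prime <= 0 or three_prime <= 0:
        raise ValueError("both counts must be positive")
    return math.log2(five_prime / three_prime)


def summarize_asymmetry(counts, folds=(2, 10)):
    """Count tRNAs with both termini detected and with at least N-fold asymmetry.

    counts: dict mapping a tRNA identifier to a (5' reads, 3' reads) tuple
    """
    both = {k: v for k, v in counts.items() if v[0] > 0 and v[1] > 0}
    summary = {"both_ends": len(both)}
    for fold in folds:
        summary[f">={fold}x"] = sum(
            1 for five, three in both.values()
            if max(five, three) / min(five, three) >= fold
        )
    return summary


# Hypothetical per-tRNA read counts: (5' terminal reads, 3' terminal reads).
toy_counts = {"tRNA_a": (100, 5), "tRNA_b": (10, 4), "tRNA_c": (7, 6), "tRNA_d": (0, 50)}
toy_summary = summarize_asymmetry(toy_counts)
```

Only tRNAs with reads at both termini enter the fold comparison, mirroring the restriction to the 335 tRNAs that match both 5' and 3' terminal tRFs.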

We next sought to find out whether the terminal RNAs 
and their asymmetric processing bias are characteristics 
that are evolutionarily conserved. Analysis of small RNA 
libraries (36) from mouse embryonic stem cells (mESCs)
revealed that mESCs contain very abundant terminal
RNAs derived precisely from the 5' and 3' ends of 
tRNAs (Figure 4A) that also manifest asymmetry in pro- 
cessing (Figure 4B). Similar to human terminal tRFs, 
termini-specific processing (Supplementary Figure S3) 
and asymmetric stabilization (Supplementary Figure S4) 
are also observed for mouse rRNAs, snoRNAs and 
snRNAs. Furthermore, comparison of the asymmetric
processing bias of individual human tRNAs and their
putative mouse orthologs reveals a modest conservation
in their processing bias in generating terminal
tRFs (Supplementary Figure S5). Intriguingly, the process-
ing biases of 20-30 nt long tRFs are more cor-
related (r = 0.67) between human and mouse than those of
the 15- to 19-nt-long tRFs. Taken together, these data
suggest that the mechanism to generate terminal RNAs
from various ncRNAs is a conserved process between
human and mouse.

Terminal small RNAs are primarily independent 
of Dicer and DGCR8 

Since a few Dicer-dependent RNA products from tRNAs 
(8) and snoRNAs (3) are known, we investigated whether 
terminal RNAs are commonly processed through the 
canonical miRNA pathway. Unlike other studies (4,36)
that normalized the data sets using RNAs derived from
tRNAs and snoRNAs, we normalized using mRNA-derived
fragments (see 'Materials and Methods' section), because
we found that mRNAs, but not tRNAs
and snoRNAs, produce fragments that largely resemble
random degradation. Since the dependence of shorter
tRFs (~15 nt) on Dicer/DGCR8 has not been investigated
yet, we separately analyzed the terminal RNAs of the two
different size groups (20-30 and 15-19 nt). As a class,
each terminal small RNA group manifests statistically
insignificant (P > 0.05) changes after either dicer1
or dgcr8 knockout (Supplementary Figure S6A).
However, it is important to note that specific RNAs
within tRNAs, snoRNAs and snRNAs do manifest con-
siderable (>2-fold) differences in both Dicer and DGCR8
knockouts (Supplementary Figure S6B). We next tested
whether the observed asymmetric processing of 5' and
3' tRFs is maintained during Dicer/DGCR8 deletions.
To compare asymmetric processing across different
samples, the processing bias was evaluated in each
library and for the 20-30 and 15-19 nt size groups
(Figure 4C). Consistent with the notion that these RNAs
are largely independent of the RNAi pathway, the
asymmetry profiles within each group are constant
across wild-type and dicer1/dgcr8 knockout mES cells,
yielding Pearson correlation coefficients (r) in the range
of 0.89-0.97.
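
The comparison of asymmetry profiles between libraries relies on the Pearson correlation coefficient. A minimal sketch of that calculation is given below; the function name and the example bias values are hypothetical and only illustrate how per-tRNA bias values from two libraries would be compared.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-tRNA bias values (same tRNA order in both libraries).
wild_type_bias = [3.2, -1.5, 0.4, 6.1, -4.0]
knockout_bias = [2.9, -1.1, 0.9, 5.5, -3.6]
r = pearson_r(wild_type_bias, knockout_bias)
```

A value of r close to 1 indicates that tRNAs keep the same 5' or 3' preference in both libraries, which is the interpretation given to the 0.89-0.97 range above.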

Presence of endogenous Ago2-associated 3' tRFs 
that are capable of cleaving target RNAs 

To further investigate processing and function of tRFs, we
selected two sequences (HisGTG, LeuCAG) for further 



Nucleic Acids Research, 2012, Vol. 40, No. 14 6793 



[Figure 4 panels: (A) read counts against sRNA 5' position in 5' end aligned tRNAs and sRNA 3' position in 3' end aligned tRNAs; (B) 5' termini and 3' termini reads for individual mouse tRNAs; (C) WT, Dicer KO and DGCR8 KO for individual mouse tRNAs; 15-19 nt: r(WT vs Dicer KO) = 0.89, r(WT vs DGCR8 KO) = 0.95; 20-30 nt: r(WT vs Dicer KO) = 0.90, r(WT vs DGCR8 KO) = 0.97.]

Figure 4. Processing of 5'-3' terminal RNA fragments is conserved in mice. (A) Position of the small RNA reads relative to 5' (left panel) and
3' (right panel) termini of tRNAs. (B) Similar to human 5'-3' terminal tRFs, the mouse tRFs preferentially yield either 5' or 3' fragments.
(C) Asymmetric bias of abundant (>10 reads/million) 5' and 3' tRFs is well preserved across different conditions [wild-type, Dicer/DGCR8
knockouts (KO)] for each tRNA, with high Pearson correlation coefficients (0.89-0.97).



in vitro studies (Figure 5). We first analyzed possible Dicer 
requirements for their biogenesis (Figure 5A). Total RNA 
from wild-type (Dicer+/+) and Dicer knockout (Dicer-/-)
mouse embryonic fibroblasts (MEFs) was used for 
Northern blotting using probes against the HisGTG 
(Figure 5A, upper panel) and the LeuCAG fragment 
(lower panel). Indeed, small RNA production from these 
tRNAs is readily detectable in Dicer-deficient cells, 
indicating that these tRFs are independent of Dicer process-
ing. To probe into the possibility that some terminal RNAs
might use the RNAi pathway to inhibit target transcripts,
we further examined whether tRFs associate with Ago2. 
Endogenous Ago2 was immunoprecipitated using 
anti-Ago2 antibodies and co-immunoprecipitated RNA 
was analyzed by Northern blotting using probes specific to 
HisGTG and LeuCAG (Figure 5B). Both tRFs were 
detected in anti-Ago2 immunoprecipitates demonstrating 
that tRFs can be specifically loaded into Ago protein 
complexes. Although previous studies have indicated that 
Ago2 co-immunoprecipitates with tRF-like fragments (6,8),
Figure 5. Presence of Ago2-associated 3' tRFs that can cleave target RNA. (A) Total RNA from Dicer knockout mouse cells (Dicer-/-) or
wild-type cells (Dicer+/+) was analyzed by northern blotting using probes specific for the 3' ends of tRNAs (HisGTG and LeuCAG), and miR-125b.
(B) Endogenous Ago2 was immunoprecipitated from human HEK293 lysates, and the co-immunoprecipitated RNAs were extracted and analyzed by
northern blots using probes against the 3' tRFs (HisGTG and LeuCAG). GST antibody is used as control. (C) Immunoprecipitated Flag-HA-Ago1
and Flag-HA-Ago2 were incubated with a 32P-cap-labeled target RNA (~100 nt long), which contained a perfect complementary sequence to the
3' termini of tRNAs (HisGTG and LeuCAG). Lanes indicated with T1 represent RNase T1 digestions of the RNA substrates as ladders. The RNA
sequence complementary to tRNA HIS-3' and tRNA LEU-3' is indicated by a black bar on the left.



it has not been shown yet whether tRFs can cleave target 
RNAs. Indeed, based on the selection of high abundance of 
reads in Ago2-associated data, northern blotting assays 
confirmed that 3' tRFs from two tRNAs are associated 
with endogenous Ago2 (Figure 5B). Analysis using 
Flag-HA-Ago2 complexes also reconfirmed the association 
of these 3' tRFs with Ago2 (Figure 5C). Since Ago2, but not
Ago1, cleaves target mRNA (37,38), we immunoprecipitated
Flag-HA-Ago1 and Flag-HA-Ago2 complexes from trans-
fected HEK293 cells using anti-Flag antibodies (Figure 5C)
and incubated them with a 32P-cap labeled artificial target
mRNA (~100 nt) containing a region (17-18 nt) fully com-
plementary to the endogenous 3' tRFs from the two differ-
ent tRNAs (LeuCAG and HisGTG). Both endogenous 3'
tRFs directed Ago2-mediated cleavage. In contrast,
although both 3' tRFs are found in the Ago1 sequencing
library (3), the catalytically inactive Ago1 (38) does not
co-purify with any cleavage products. These results
suggest that 3' tRFs from tRNAs can function in an 
RNAi-like pathway. Taken together with previous reports 
using luciferase assays (6), our data support the notion that 



tRFs have the potential to knock down target genes both at 
RNA and protein levels. 

ANG and other RNases can cleave tRNAs to produce 
terminal tRFs 

Since tRFs correspond to the most abundant and diverse 
group of terminal RNAs and are independent of the ca- 
nonical miRNA processors, we sought to understand their 
biogenesis. To test potential candidate proteins involved 
in the production of terminal tRFs, we focused on the 
ANG nuclease (39), which generates 30-50 nt long
sitRNAs from anti-codon loops during stress (12,18) and 
was reported as a tRNA-specific enzyme (39,40). We 
treated total RNA from 293 cells in vitro with recombinant 
ANG and monitored the production of terminal RNA 
fragments for tRNAs, rRNAs and snRNAs. At the 
initial time point (0 h), endogenous small RNA species
are present in the absence of ANG treatment (Figure 6A 
and Supplementary Figure S7). Expression of these en-
dogenous RNA species is enhanced after ANG treatment
Figure 6. Multiple RNases generate terminal tRFs from total RNA. (A) ANG (10 µg/ml) generates ~20 nt long terminal tRFs as well as longer
sitRNAs from total RNA extracts. In comparison, an equal concentration of RNase A completely degrades the RNA within an hour, while a control
lane with solution buffer (Buffer) generates terminal tRFs at much lower levels. (B) In vitro cleavage assay (t = 1 h) using total RNA from HEK293
cells (10 µg/lane) and different RNases at a series of concentrations indicates that at low concentrations, both RNase A and RNase I are able to
produce terminal tRFs similar to those of ANG (arrows). In contrast, RNase T1 generates a different set of 3' tRFs.



in a time-dependent manner, suggesting ANG is involved
in generating a variety of tRNA fragments, the sizes of
which correspond to previously reported 30-50 nt
sitRNAs (12), as well as the ~20 nt 5' (Supplementary
Figure S7) and 3' (Figure 6A) terminal RNAs reported
in our study. Processing of rRNA and snRNA by ANG
seems to reflect random hydrolysis, resulting in a series of
RNA fragments (Supplementary Figure S8). Since ANG
belongs to the RNase A superfamily (41), we examined
whether RNase A produces terminal tRFs at low concentrations,
and tested the cleavage pattern of additional RNases. At a
much (~30×) lower concentration (0.3 µg/ml) than that
used for ANG, RNase A was able to generate
both the ~20 and ~25 nt tRF bands specifically after 1-h
digestion (Figure 6B). Intriguingly, RNase I, a bacterial
RNase that can non-specifically digest all dinucleotide
bonds (42), also generated similar tRFs at low concentra-
tions. In contrast, RNase T1, which cleaves only after
unpaired G residues (43), generates fragments that are
different from those of ANG and RNase A/I (Figure 6B). These
observations indicate that multiple RNases expressed in a
given cell-type are likely responsible for the diverse range
of abundant terminal tRFs with varying lengths.

To further analyze the roles of different tRNA regions 
in tRF production, we designed a series of mutations to 



disrupt specific functional sub-structures of the tRNA and 
monitored the tRNA cleavage patterns generated by ANG 
(Figure 7 and Supplementary Figure S9). The mutations 
were designed by an in silico screening of base substitu- 
tions to yield candidates that preserve secondary struc- 
tures outside the mutated domains. ANG cleavage of 
wild-type tRNAs produced ~20nt long 3' terminal tRFs 
(Figure 7A). Mutations that individually disrupt the 
acceptor stem, the D stem, the anti-codon stem and the 
TψC stem were first examined (Supplementary Figure 
S9A-S9D). None of these mutations interfered with the 
production of the terminal tRNA fragments by ANG after 
in vitro cleavage for 0.5 h. Another mutation that completely 
altered the tRNA clover-leaf structure also did 
not visibly affect tRF production (Supplementary Figure 
S9E). However, three different mutants, each involving 
point mutations around the ANG cleavage site on the 
TψC loop, resulted in complete disappearance of the 
terminal RNA bands (Figure 7C). Since the in vitro 
transcribed tRNAs do not have modified nucleosides, 
these results also indicate that post-transcriptional modi- 
fications of tRNA are not a requirement for terminal 
RNA processing. In summary, the major determinant in 
processing of terminal small tRNAs by ANG and possibly 
related RNases is the TψC loop within tRNA. 
Figure 7. ANG processes tRFs within the TψC loop. (A) ANG can process in vitro transcribed, synthetic tRNA (untreated/buffer/ANG incubations 
at 0.5 h) to generate terminal tRFs. (B) Secondary structure of the in vitro-transcribed tRNA, highlighting the three mutations in the TψC loop 
(M1-M3) and the tRF cleavage site (arrow). (C) ANG processing of the tRF is impaired by tRNA mutations in the TψC loop (M1-M3). 



Endogenous retroviral replication loci are preferentially 
antisense to 3' terminal tRFs 

The binding of tRNAs to retroviral primer binding sites 
(PBS) facilitates the initiation of retroviral genome replication 
(44,45), such as that for the human immunodeficiency 
virus (HIV). Since endogenous human retroviral 
sequences comprise ~7% of the human genome (46), it 
is possible that cells have evolved mechanisms to 
regulate replication of these retroviral elements, particu- 
larly for preventing accidental transcription of deleterious 
retroviral sequences. We therefore investigated whether 
the terminal tRFs have the potential to bind to human 
endogenous retroviruses (HERVs). The majority of 
HERVs exist in the human genome as long terminal 
repeat (LTR) retrotransposons (47) which, together with 
non-viral retrotransposons LINEs (long interspersed 
nuclear elements) and SINEs (short interspersed 
nuclear elements), form roughly 40% of the mammalian 
genome (48). Terminal 3' tRFs are remarkably more 
complementary to HERV LTRs than to LINEs and 
SINEs (Figure 8A). Furthermore, even though the 
numbers of distinct 3' and 5' BCP1 tRFs differ only by a 
factor of two (1002 versus 507), 3' tRF sequences are 
26-fold (2614 versus 100) more frequently complementary 
to retrotransposon elements than the 5' tRFs, corresponding 
to a normalized enrichment of 13-fold. Similarly, 
3' tRFs are 4- to 6-fold more enriched (normalized) for 
complementary sites in LINEs and SINEs than 5' tRFs. 
Analysis of the major LTR elements that are complementary 
to BCP1 tRFs (Figure 8B) indicates that the top 
candidate tRF (LysTTT tRNA, 947 reads) is nearly identical 
to a known antisense DNA oligonucleotide that 
targets the PBS region of HIV and can inhibit HIV repli- 
cation (49). Moreover, the two 3' tRFs (LeuCAG and 
HisGTG tRNAs) that we found to associate with Ago2 
are also capable of degrading their target RNAs, and are 
complementary to the ERVL-E and HERVH LTR 
elements (Figure 5C). 
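
The normalized enrichment quoted above can be reproduced with a short calculation. The following is a minimal sketch in Python, assuming that "normalized" means dividing the raw ratio of complementary hits by the ratio of distinct tRF counts; the function name `normalized_enrichment` and its parameter names are illustrative and do not come from the original analysis.

```python
def normalized_enrichment(hits_3p, hits_5p, distinct_3p, distinct_5p):
    """Return (raw fold, normalized fold) for 3' versus 5' tRFs.

    Assumption: normalization divides the raw hit ratio by the ratio of
    the numbers of distinct tRFs in each set.
    """
    raw_fold = hits_3p / hits_5p            # ratio of complementary hits
    size_ratio = distinct_3p / distinct_5p  # ratio of distinct tRF counts
    return raw_fold, raw_fold / size_ratio


# Counts reported in the text: 2614 versus 100 complementary hits,
# 1002 versus 507 distinct 3' and 5' BCP1 tRFs.
raw, norm = normalized_enrichment(2614, 100, 1002, 507)
print(f"raw fold: {raw:.1f}, normalized: {norm:.1f}")
# prints "raw fold: 26.1, normalized: 13.2"
```

Under this assumption, the calculation gives roughly 26-fold before and 13-fold after correcting for the two-fold difference in the number of distinct tRFs, matching the figures stated in the text.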



DISCUSSION 

We present here an in-depth study of terminal-specific 
small RNAs from rRNAs, snRNAs, snoRNAs, tRNAs 
and mRNAs that are generally missed in deep sequencing 
studies because they are commonly filtered out as poten- 
tial degradation products during bioinformatics analysis. 
While such strict filtering greatly helps in focusing on the 
most abundant and likely functional small RNA 
molecules, emerging studies continue to provide evidence 
that degradation-like products can have functional impact 
and are abundant. Most importantly, whether these small 
RNA products are functional or cellular noise, their study 
continues to open up novel cellular mechanisms and interesting 
cellular attributes (1-4,6-13,16,18,19,30,50-53). 

Our own detailed analyses of small RNAs derived from 
various classes of RNAs reveal novel, biologically import- 
ant features for these small RNAs and support the notion 
that terminal RNAs are not artifacts of the cloning 
methods used in deep sequencing. For clarity, 
we note that the tRFs in this study are derived from 
mature RNAs and do not include tRFs that can 
also be processed from precursors of mature RNAs, such as 
pre-tRNAs (6,7). Comparisons across all novel 
and previously reported classes of tRNA/snoRNA-derived 
RNAs (3,4,6-8) indicate that the production 
of terminal- and 5'- or 3'-end-specific small RNAs 
is a common feature of constitutively expressed ncRNAs. 
The distinct size distribution of terminal RNAs, as shown 
by sequencing and northern blots, seems atypical for RNA 
turnover products generated by common nuclear proofreading 
pathways, where mutant or defective RNAs are 
degraded by nuclear exosomes (50,54), or by the 5'-3' 
exonucleases Rat1 and Xrn1 (52). The extensive identification 
of both 5' and 3' terminal tRFs with similar size distributions 
in independent deep-sequencing studies that include 
Ago-bound small RNAs suggests that tRFs exist due to a 
yet unknown terminal-specific degradation mechanism or 
a processing mechanism that yields functional RNA 
fragments. The phenomenon of terminal RNAs is also 
Figure 8. 3' terminal tRFs might inhibit endogenous retroviral replication. (A) LTRs are highly enriched in complementary sequences of 3' terminal 
tRFs. (B) The major tRNA terminal regions that produce abundant tRFs (reads > 10, column 4) and their complementary LTR elements. (C) A 
potential model for the 3' tRF pathway. The binding of 3' tRF to the transcribed viral RNA could recruit double-stranded RNA specific 
endonucleases such as the highly efficient Ago2, enabling the rapid cleavage of the transcribed endogenous viral RNA. 



present in mice, indicative of a conserved terminal RNA 
processing/degradation mechanism. While our in vitro 
cleavage assay results suggest that various RNase 
families including ANG are likely the factors that drive 
tRF production, the association of these RNAs with Ago 
as well as the dearth of mRNA-derived tRFs further 
support the notion that these RNAs could have a biological 
function, perhaps in a coordinated manner (55). 

Why do ncRNAs but not mRNAs manifest terminal- 
specific processing of small RNAs? Since mRNAs 
include many constitutively expressed RNAs, 
terminal RNA processing is unlikely to be a simple 
feature of all constitutively expressed RNAs, which 
provides additional support for the notion that the phenomenon 
is independent of a random degradation process. We 
also did not detect a preference for terminal small RNAs 



among other well-known ncRNAs such as Y-RNAs (56). 
One possibility is that terminal RNA processing might be a 
hallmark of RNAs involved in fundamental and ancient 
core processes such as translation. Consistent with the 
notion of terminal RNAs as a process that evolved early, 
we postulate that 3' tRFs may have evolved as RNAi-based 
modulators that block the replication of endogenous retro- 
viral sequences (Figure 8C). Indeed, the RNAi pathway 
evolved as an immune defense mechanism against viruses 
in basal organisms that do not have protein-based adaptive 
immune systems (57). While these concepts support the 
view that terminal RNAs might be limited to core 
ncRNAs as a pervasive phenomenon, we cannot yet rule 
out that other classes of RNAs under various environmen- 
tal conditions (e.g. stress) may also preferentially generate 
such terminal small RNAs. 
Despite the prominent abundance of tRFs, their precise 
functions remain unknown, yet several observations 
suggest that the terminal tRFs may have a functional role 
in the cell. The depletion of a tRF derived from a pre-tRNA 
is known to increase cell proliferation and elevate the 
number of cells in G2 phase of the cell cycle (7). Similarly, 
the longer class of sitRNAs that are generated by ANG can 
inhibit protein translation (12) to counteract stress, and 
terminal tRFs likely have a similar function. Indeed, the 
general role of small RNAs in stress response programs is 
an emerging theme (58). Clearly, these observations 
warrant mechanistic studies of the underlying pathways. 
Our own observations that link 3' tRFs to HERVs 
provide a testable hypothesis for a specific cellular role 
for 3' terminal tRFs. It is conceivable that in normal cells 
3' tRFs may be able to bind to the PBS region of human 
endogenous retroviruses, blocking their replication through 
endonuclease (e.g. Ago2) cleavage of the target transcript. 
Indeed, we found that 3' tRFs are endogenously associated 
with Ago2 and are able to guide Ago2 to cleave the target 
RNA. Detailed experimental testing of this hypothesis 
could take several years of research not only because of 
the complexity of the regulation but also the difficulty in 
working with endogenous viral products. It also remains 
difficult to study these RNAs because common 
technologies are either not reliable (e.g. RT-PCR) or 
laborious (e.g. northern blot) for detecting these RNAs. Our 
own studies would have been considerably more difficult 
had we not used recently developed northern blot 
protocols (22) to routinely assay for the expression of 
these small RNAs. It is likely that emerging third-generation 
sequencing technologies and other microfluidic 
technologies, combined with multiplexing, will provide 
methods for cheaper and more robust detection of these 
difficult-to-study small RNAs that are typically smaller 
than even miRNAs (59). 